{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1. Bayesian optimization\n",
    "*Deep Bayes summer school, 2018*\n",
    "\n",
    "*A. Zaytsev, Y. Kapushev*\n",
    "\n",
    "Content\n",
    "1. Bayesian optimization overview\n",
    "2. Overview of libraries\n",
    "3. One dimensional example\n",
    "4. AutoML: optimization of hyperparameters for machine learning model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# When Bayesian optimization?\n",
    "\n",
    "* Optimization of \"heavy\" functions \n",
    "* The target function is a blackbox, typically noisy, while smooth\n",
    "\n",
    "\n",
    "* Construction a regression model using available data\n",
    "* Take into account uncertainty of the regression model\n",
    "* Gaussian process regression is OK"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Optimization workflow:\n",
    "1. Construct a regression model $\\hat{f}(x)$ of a function $f(x)$ using the sample $D = \\{(x_i, f(x_i))\\}_{i = 1}^n$\n",
    "2. Select a new point that maximize an acquisition function\n",
    "$$\n",
    "x_{new} = \\arg\\max\\limits_x a(x)\n",
    "$$\n",
    "3. Calculate $f(x_{new})$ at the new point.\n",
    "4. Add the pair $(x_{new}, f(x_{new}))$ to the sample $D$.\n",
    "5. Update the model $\\hat{f}(x)$ and go to step 2."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Examples of the acquisation functions \n",
    "\n",
    "#### Upper confidence bound (UCB) \n",
    "\n",
    "$$\n",
    "UСB(x) = \\hat{f}(x) + \\beta \\hat{\\sigma}(x),\n",
    "$$\n",
    "$\\hat{f}(x), \\hat{\\sigma}(x)$ - mean and standard deviation of the Gaussian process regression model at $x$.\n",
    "\n",
    "#### Expected Improvement (EI) \n",
    "\n",
    "$$\n",
    "EI(x) = \\mathbb{E}_{p(\\hat{f})} \\max(0, f_{min} - \\hat{f}(x)). \n",
    "$$\n",
    "\n",
    "\n",
    "Usually we use logarithm of EI."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"EI_vs_logEI.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2. Bayesian optimization libraries:\n",
    "\n",
    "| Library        | #commits           | #stars | #last commit |\n",
    "| ------------- | -----:| -----:| -----:|\n",
    "| hyperopt      | 950   | 2275  | 02.07.2018 | \n",
    "| BayesOpt      | 515   | 157   | 30.03.2018 | \n",
    "| GPyOpt        | 463   | 303   | 26.06.2018 |\n",
    "| GPflowOpt     | 407   | 107   | 16.04.2018 | \n",
    "| pyGPGO        | 273   | 78    | 07.11.2017 | \n",
    "\n",
    "More libraries for Matlab (SUMO) and other languages.\n",
    "\n",
    "*Actually it is not hard to write your own library on top of your favorite Gaussian Process Regression library.*\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Additional libraries\n",
    "\n",
    "We need libraries for\n",
    "* Gaussian process regression **GPy** (see previous seminar)\n",
    "* Gaussian process regression-based Bayesian optimization **GPyOpt**\n",
    "\n",
    "See more use cases of **GPyOpt** at http://nbviewer.jupyter.org/github/SheffieldML/GPyOpt/blob/master/manual/index.ipynb"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To install GPyOpt run the following line"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install emcee\n",
    "!pip install GPyOpt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import GPy\n",
    "import GPyOpt\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "# auxiliary functions\n",
    "import utility\n",
    "% matplotlib inline\n",
    "from IPython.display import clear_output\n",
    "from tqdm import trange\n",
    "\n",
    "# emcee sampler is required to run Entropy search in GPyOpt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 3. One dimensional example\n",
    "\n",
    "We demonstrate concepts using one-dimensional example.\n",
    "\n",
    "Let us consider Bayesian optimization for one-dimensional function **Forrester**:\n",
    "$$\n",
    "f(x) = (6 x - 2)^2 \\sin(12 x - 4).\n",
    "$$\n",
    "\n",
    "The optimization problem is the following:\n",
    "$$\n",
    "f(x) \\rightarrow \\min, x \\in [0, 1].\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# we can load it from GPyOpt library\n",
    "forrester_function = GPyOpt.objective_examples.experiments1d.forrester()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "forrester_function.f(np.array([0.5]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "forrester_function.bounds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "forrester_function.plot()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Select the region where we search the optimum\n",
    "space = [{'name': 'x', 'type': 'continuous', 'domain': (0, 1)}]\n",
    "design_region = GPyOpt.Design_space(space=space)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Select the initial design\n",
    "from numpy.random import seed # fixed seed\n",
    "seed(123456)\n",
    "\n",
    "initial_sample_size = 5\n",
    "initial_design = GPyOpt.experiment_design.initial_design('random', design_region, initial_sample_size)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "uniform_dense_grid = np.linspace(0, 1, 200).reshape(-1, 1)\n",
    "\n",
    "# plot function: curve and values at the initial design points\n",
    "utility.plot_one_dimensional_function(forrester_function, \n",
    "                                      uniform_dense_grid, \n",
    "                                      initial_design)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## We defined the problem - now we create a machine to solve it\n",
    "\n",
    "1. A black box that evaluates the target function\n",
    "2. What kind of the regression model we need\n",
    "3. How do we optimize the acquisition function\n",
    "4. What kind of the acquisition function we use\n",
    "5. Should we use optimizer in batch or continuous mode?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# The target function\n",
    "objective = GPyOpt.core.task.SingleObjective(forrester_function.f)\n",
    "\n",
    "# Model type\n",
    "gp_model = GPyOpt.models.GPModel(exact_feval=True, optimize_restarts=10, verbose=False) \n",
    "# exact_feval - are evaluations exact?\n",
    "# optimize_restarts - number of restarts at each step\n",
    "# verbose - how verbose we are\n",
    "\n",
    "# Optimizer of the acquisition function, the default is 'LBFGS'\n",
    "aquisition_optimizer = GPyOpt.optimization.AcquisitionOptimizer(design_region)\n",
    "\n",
    "# The acquisition function is expected improvement\n",
    "acquisition_function = GPyOpt.acquisitions.AcquisitionEI(gp_model, design_region, optimizer=aquisition_optimizer)\n",
    "\n",
    "# How we collect the data\n",
    "evaluator = GPyOpt.core.evaluators.Sequential(acquisition_function)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Now we are ready to construct the machine\n",
    "bayesian_optimizer = GPyOpt.methods.ModularBayesianOptimization(gp_model, design_region, objective, \n",
    "                                                acquisition_function, evaluator, initial_design)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Run the first six iterations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Stopping criteria\n",
    "max_time = None \n",
    "max_number_of_iterations = 5\n",
    "tolerance = 1e-8 # distance between consequitive observations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run five iterations\n",
    "for iteration in range(max_number_of_iterations):\n",
    "    bayesian_optimizer.run_optimization(max_iter=1, max_time=max_time, \n",
    "                                        eps=tolerance, verbosity=False) \n",
    "  \n",
    "    bayesian_optimizer.plot_acquisition()\n",
    "    clear_output(wait=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Now we run more iterations - 25"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "bayesian_optimizer = GPyOpt.methods.ModularBayesianOptimization(gp_model, design_region, objective, \n",
    "                                                acquisition_function, evaluator, initial_design)\n",
    "\n",
    "max_number_of_iterations = 25\n",
    "bayesian_optimizer.run_optimization(max_iter=max_number_of_iterations, max_time=max_time, \n",
    "                                    eps=tolerance, verbosity=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Analyze problems"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bayesian_optimizer.plot_acquisition()\n",
    "bayesian_optimizer.plot_convergence()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Obtained xmin:  %.3f, real xmin:  %.3f (approximate)' % (bayesian_optimizer.x_opt, forrester_function.min))\n",
    "print('Obtained fmin: %.3f, real fmin: %.3f (approximate)' % (bayesian_optimizer.fx_opt, forrester_function.fmin))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Problem 1.1\n",
    "\n",
    "Compare various acquisition functions EI, UCB and PI"
   ]
  },
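  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For reference, probability of improvement (PI) is not defined above. A common definition, assuming a Gaussian posterior with mean $\\hat{f}(x)$ and standard deviation $\\hat{\\sigma}(x)$ and writing $\\Phi$ for the standard normal CDF (notation introduced here), is\n",
    "$$\n",
    "PI(x) = \\mathbb{P}(\\hat{f}(x) < f_{min}) = \\Phi\\left(\\frac{f_{min} - \\hat{f}(x)}{\\hat{\\sigma}(x)}\\right).\n",
    "$$\n",
    "Since we minimize, the counterpart of UCB is the lower confidence bound $\\hat{f}(x) - \\beta \\hat{\\sigma}(x)$. To our knowledge, GPyOpt provides these criteria as `GPyOpt.acquisitions.AcquisitionMPI` and `GPyOpt.acquisitions.AcquisitionLCB`; check the names against your installed version."
   ]
  },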
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Your code below. Use EI, UCB and PI acquisition functions\n",
    "number_of_runs = 10\n",
    "obtained_targets_history_EI = []\n",
    "for index in trange(number_of_runs):\n",
    "    seed(index)\n",
    "    #### Your code here ####\n",
    "    \n",
    "    ########################\n",
    "\n",
    "# your code below for other criteria\n",
    "obtained_targets_history_UCB = []\n",
    "for index in trange(number_of_runs):\n",
    "    seed(index)\n",
    "    #### Your code here ####\n",
    "    \n",
    "    ########################\n",
    "obtained_targets_history_PI = []\n",
    "for index in trange(number_of_runs):\n",
    "    seed(index)\n",
    "    #### Your code here ####\n",
    "    \n",
    "    ########################"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "initial_sample_size = 5\n",
    "obtained_targets_history_EI = np.array(obtained_targets_history_EI)\n",
    "obtained_targets_history_UCB = np.array(obtained_targets_history_UCB)\n",
    "obtained_targets_history_PI = np.array(obtained_targets_history_PI)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.shape(obtained_targets_history_EI)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "obtained_targets_history_EI_cut = obtained_targets_history_EI[:, initial_sample_size:]\n",
    "obtained_targets_history_UCB_cut = obtained_targets_history_UCB[:, initial_sample_size:]\n",
    "obtained_targets_history_PI_cut = obtained_targets_history_PI[:, initial_sample_size:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "interations_number = 10\n",
    "iterations = range(interations_number)\n",
    "\n",
    "lower_percentile = 25 \n",
    "upper_percentile = 75 \n",
    "\n",
    "plt.plot(iterations, np.mean(obtained_targets_history_EI_cut, axis=0), label='EI')\n",
    "plt.fill_between(iterations, \n",
    "                 np.percentile(obtained_targets_history_EI_cut, axis=0, q=lower_percentile),\n",
    "                 np.percentile(obtained_targets_history_EI_cut, axis=0, q=upper_percentile), alpha=0.3)\n",
    "\n",
    "plt.plot(iterations, np.mean(obtained_targets_history_UCB_cut, axis=0), label='UCB')        \n",
    "plt.fill_between(iterations, \n",
    "                 np.percentile(obtained_targets_history_UCB_cut, axis=0, q=lower_percentile),\n",
    "                 np.percentile(obtained_targets_history_UCB_cut, axis=0, q=upper_percentile), alpha=0.3)\n",
    "\n",
    "plt.plot(iterations, np.mean(obtained_targets_history_PI_cut, axis=0), label='PI')\n",
    "plt.fill_between(iterations, \n",
    "                 np.percentile(obtained_targets_history_PI_cut, axis=0, q=lower_percentile),\n",
    "                 np.percentile(obtained_targets_history_PI_cut, axis=0, q=upper_percentile), alpha=0.3)\n",
    "plt.legend(loc='upper right')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Problem 1.2 \n",
    "\n",
    "Select the best $\\beta$ for UCB acquisition function wrt mean regret."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO select different values of beta to get the best one\n",
    "number_of_runs = 5\n",
    "obtained_targets_history_beta = []\n",
    "beta_space = np.logspace(0.1, 10, 5)\n",
    "for beta in beta_space:\n",
    "    obtained_targets_history_UCB = []\n",
    "    for index in trange(number_of_runs):\n",
    "        seed(index)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO plot results\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 4. Bayesian optimization for the parameters of Gradient boosting\n",
    "\n",
    "Now we optimize hyperparameters for Gradient boosting of decision trees."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from sklearn.model_selection import cross_validate\n",
    "from sklearn.model_selection import ShuffleSplit\n",
    "from sklearn.metrics import roc_auc_score"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If lightgbm is not installed, please run the following cell"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "!pip install lightgbm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import lightgbm as lgb"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### We predict defaults for the classification problem\n",
    "\n",
    "The goal is to predict if the two years absense of payments occur."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the training sample\n",
    "data = pd.read_csv('training_data.csv')\n",
    "data.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "points = data[['RevolvingUtilizationOfUnsecuredLines', 'age',\n",
    "       'NumberOfTime30-59DaysPastDueNotWorse', 'DebtRatio', 'MonthlyIncome',\n",
    "       'NumberOfOpenCreditLinesAndLoans', 'NumberOfTimes90DaysLate',\n",
    "       'NumberRealEstateLoansOrLines', 'NumberOfTime60-89DaysPastDueNotWorse',\n",
    "       'NumberOfDependents']]\n",
    "targets = data['SeriousDlqin2yrs']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "lgb_ensemble = lgb.LGBMClassifier()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cross_validation_result = cross_validate(lgb_ensemble, \n",
    "                                         points, targets, \n",
    "                                         scoring='roc_auc')\n",
    "print(np.mean(cross_validation_result['test_score']))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Define the region\n",
    "hyperparameters_space = [{'name': 'learning_rate', 'type': 'continuous', 'domain': (0.05, 0.2)},\n",
    "         {'name': 'n_estimators', 'type': 'discrete', 'domain': np.arange(10, 200)},\n",
    "         {'name': 'subsample', 'type': 'continuous', 'domain': (0.75, 1.)}]\n",
    "# We are interested in the following variables:\n",
    "# 'continuous', \n",
    "# 'discrete', \n",
    "# 'categorical',\n",
    "\n",
    "hyperparameters_design_region = GPyOpt.Design_space(space=hyperparameters_space)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def get_cv_quality(model_parameters_list):\n",
    "    r\"\"\"\n",
    "  \n",
    "    Quality of model using given hyperparameters\n",
    "\n",
    "    Inputs\n",
    "    --------\n",
    "    model_parameters : np.array\n",
    "      numpy array of hyperparameteres specified in the same way as a domain\n",
    "\n",
    "    Outputs\n",
    "    --------\n",
    "    minus_roc_auc : float\n",
    "      minus mean value of ROC AUC calculated via cross validation\n",
    "    \"\"\"\n",
    "    test_score_list = []\n",
    "    for model_parameters in model_parameters_list:\n",
    "        classification_model = lgb.LGBMClassifier()\n",
    "\n",
    "        dict_model_parameters = dict(zip([element['name'] for element in hyperparameters_space], \n",
    "                                         model_parameters))\n",
    "      \n",
    "        # transform types to int for discrete variables\n",
    "        for key in dict_model_parameters:\n",
    "            hyperparameter_description = [x for x in hyperparameters_space if x['name'] == key][0]\n",
    "            if hyperparameter_description['type'] == 'discrete':\n",
    "                dict_model_parameters[key] = int(dict_model_parameters[key])\n",
    "            \n",
    "        classification_model.set_params(**dict_model_parameters)\n",
    "        test_score = -np.mean(cross_validate(classification_model, \n",
    "                              points, targets, scoring='roc_auc')['test_score'])\n",
    "        test_score_list.append(test_score)\n",
    "    return test_score"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "get_cv_quality([[]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "get_cv_quality([[0.1, 20., 0.9]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Run Bayesian optimization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "hyperparameters_optimization_problem = GPyOpt.methods.BayesianOptimization(get_cv_quality,  # function to optimize       \n",
    "                                          domain=hyperparameters_space,         # box-constrains of the problem\n",
    "                                          acquisition_type='EI')   # Exploration exploitation\n",
    "hyperparameters_optimization_problem.run_optimization(max_iter=50)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.plot(hyperparameters_optimization_problem.Y);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hyperparameters_optimization_problem.plot_convergence()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Validate parameters using a test sample"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Load the testing sample\n",
    "test_data = pd.read_csv('test_data.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "test_points = test_data[['RevolvingUtilizationOfUnsecuredLines', 'age',\n",
    "       'NumberOfTime30-59DaysPastDueNotWorse', 'DebtRatio', 'MonthlyIncome',\n",
    "       'NumberOfOpenCreditLinesAndLoans', 'NumberOfTimes90DaysLate',\n",
    "       'NumberRealEstateLoansOrLines', 'NumberOfTime60-89DaysPastDueNotWorse',\n",
    "       'NumberOfDependents']]\n",
    "test_targets = test_data['SeriousDlqin2yrs']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "initial_model = lgb.LGBMClassifier()\n",
    "initial_model.fit(points, targets)\n",
    "test_predicted_probabilities = initial_model.predict_proba(test_points)[:, 1]\n",
    "print(roc_auc_score(test_targets, test_predicted_probabilities))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hyperparameters_optimization_problem.x_opt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "best_model = lgb.LGBMClassifier(learning_rate=hyperparameters_optimization_problem.x_opt[0], \n",
    "                                n_estimators=int(hyperparameters_optimization_problem.x_opt[1]),\n",
    "                                subsample=hyperparameters_optimization_problem.x_opt[2])\n",
    "best_model.fit(points, targets)\n",
    "test_predicted_probabilities = best_model.predict_proba(test_points)[:, 1]\n",
    "print(roc_auc_score(test_targets, test_predicted_probabilities))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Problem 2.1 \n",
    "\n",
    "Try to optimize another parameter of Gradient boosting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO define another hyperparameters space and optimise hyperparameters in it\n",
    "hyperparameters_optimization_problem.run_optimization(max_iter=50)\n",
    "hyperparameters_optimization_problem.plot_convergence()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO run test for the best model\n",
    "best_model = # TODO\n",
    "best_model.fit(points, targets)\n",
    "test_predicted_probabilities = best_model.predict_proba(test_points)[:, 1]\n",
    "print(roc_auc_score(test_targets, test_predicted_probabilities))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Problem 2.2\n",
    "\n",
    "Compare with GridSearchCV"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from sklearn.model_selection import GridSearchCV"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bonus"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optimization details\n",
    "\n",
    "Use optimization algorithm.\n",
    "We use multistart combined with L-BFGS.\n",
    "\n",
    "Multistart procedure:\n",
    "1. Generate an initial sample $x_1, \\ldots, x_n$. Evaluation the acquisation function at each point and get $(a(x_1), \\ldots, a(x_n))$.\n",
    "2. Select $k$ best points.\n",
    "3. Use each point as the initial point for running (L-BFGS) and get $k$.\n",
    "4. Select the best point."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### L-BFGS \n",
    "\n",
    "Quasi-Newton optimization using Taylor series up to the second order\n",
    "$$\n",
    "f(x_k + p) \\approx f(x_k) + \\nabla f^T(x_k) p + \\frac12 p^T \\mathbf{H}p\n",
    "$$\n",
    "$$\n",
    "p = -\\mathbf{H}^{-1}\\nabla f^T(x_k) \\approx -\\mathbf{B}_k^{-1} \\nabla f^T(x_k),\n",
    "$$\n",
    "where $\\mathbf{B}_k$ is an approximation of Hessian $\\mathbf{H}$.\n",
    "\n",
    "We update $\\mathbf{B}_k$ using the following rule:\n",
    "$$\n",
    "\\mathbf{B}_{k + 1} = \\mathbf{B}_k - \\frac{\\mathbf{B}_k s_k s_k^T \\mathbf{B}_k}{s_k^T \\mathbf{B}_k s_k} + \\frac{y_k y_k^T}{y_k^T s_k},\n",
    "$$\n",
    "where $s_k = x_{k + 1} - x_k$, $y_k = \\nabla f(x_{k + 1}) - \\nabla f(x_k)$."
   ]
  },
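  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick check of the update rule above, multiply $\\mathbf{B}_{k + 1}$ by $s_k$ (assuming $s_k^T \\mathbf{B}_k s_k \\neq 0$ and $y_k^T s_k \\neq 0$):\n",
    "$$\n",
    "\\mathbf{B}_{k + 1} s_k = \\mathbf{B}_k s_k - \\frac{\\mathbf{B}_k s_k (s_k^T \\mathbf{B}_k s_k)}{s_k^T \\mathbf{B}_k s_k} + \\frac{y_k (y_k^T s_k)}{y_k^T s_k} = \\mathbf{B}_k s_k - \\mathbf{B}_k s_k + y_k = y_k.\n",
    "$$\n",
    "So the updated matrix satisfies the secant condition $\\mathbf{B}_{k + 1} s_k = y_k$: along the last step it reproduces the observed change of the gradient, which is what we expect from a Hessian approximation."
   ]
  },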
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
